AITopics

Country: Asia > China (0.28)

Genre: Research Report > Promising Solution (0.48)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.34)

Neural Information Processing SystemsFeb-8-2026, 11:55:36 GMT

NeRF-IBVS: Visual Servo Based on NeRF for Visual Localization and Navigation Y uanze Wang 1,2 Yichao Y an

Visual localization is a fundamental task in computer vision and robotics.

artificial intelligence, correspondence, machine learning, (15 more...)

Country:

Asia > China > Shanghai > Shanghai (0.04)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
Asia > China > Tianjin Province > Tianjin (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Neural Information Processing SystemsDec-24-2025, 02:23:26 GMT

NeRF-IBVS: Visual Servo Based on NeRF for Visual Localization and Navigation

Visual localization is a fundamental task in computer vision and robotics. Training existing visual localization methods requires a large number of posed images to generalize to novel views, while state-of-the-art methods generally require dense ground truth 3D labels for supervision. However, acquiring a large number of posed images and dense 3D labels in the real world is challenging and costly. In this paper, we present a novel visual localization method that achieves accurate localization while using only a few posed images compared to other localization methods. To achieve this, we first use a few posed images with coarse pseudo-3D labels provided by NeRF to train a coordinate regression network.

visual localization, visual localization and navigation, visual servo, (8 more...)

Technology: Information Technology > Artificial Intelligence > Robots (0.59)

arXiv.org Artificial IntelligenceJul-29-2025

DeepJIVE: Learning Joint and Individual Variation Explained from Multimodal Data Using Deep Learning

Drexler, Matthew, Risk, Benjamin, Lah, James J, Kundu, Suprateek, Qiu, Deqiang

Conventional multimodal data integration methods provide a comprehensive assessment of the shared or unique structure within each individual data type but suffer from several limitations such as the inability to handle high-dimensional data and identify nonlinear structures. In this paper, we introduce DeepJIVE, a deep-learning approach to performing Joint and Individual Variance Explained (JIVE). We perform mathematical derivation and experimental validations using both synthetic and real-world 1D, 2D, and 3D datasets. Different strategies of achieving the identity and orthogonality constraints for DeepJIVE were explored, resulting in three viable loss functions. We found that DeepJIVE can successfully uncover joint and individual variations of multimodal datasets. Our application of DeepJIVE to the Alzheimer's Disease Neuroimaging Initiative (ADNI) also identified biologically plausible covariation patterns between the amyloid positron emission tomography (PET) and magnetic resonance (MR) images. In conclusion, the proposed DeepJIVE can be a useful tool for multimodal data analysis.

artificial intelligence, deep learning, machine learning, (15 more...)

2507.19682

Country: North America > United States > California (0.46)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (0.87)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neural Information Processing SystemsOct-10-2024, 04:17:21 GMT

NeRF-IBVS: Visual Servo Based on NeRF for Visual Localization and Navigation

Visual localization is a fundamental task in computer vision and robotics. Training existing visual localization methods requires a large number of posed images to generalize to novel views, while state-of-the-art methods generally require dense ground truth 3D labels for supervision. However, acquiring a large number of posed images and dense 3D labels in the real world is challenging and costly. In this paper, we present a novel visual localization method that achieves accurate localization while using only a few posed images compared to other localization methods. To achieve this, we first use a few posed images with coarse pseudo-3D labels provided by NeRF to train a coordinate regression network.

localization method, visual localization and navigation, visual servo, (4 more...)

Technology: Information Technology > Artificial Intelligence > Robots (0.61)

Tran, Nguyen Phuc, Tran, Duy Thanh, Duong, Thi Thuy Nga

Building a temperature forecasting model for the city with the regression neural network (RNN)

arXiv.org Artificial IntelligenceMay-27-2024

In recent years, a study by environmental organizations in the world and Vietnam shows that weather change is quite complex. global warming has become a serious problem in the modern world, which is a concern for scientists. last century, it was difficult to forecast the weather due to missing weather monitoring stations and technological limitations. this made it hard to collect data for building predictive models to make accurate simulations. in Vietnam, research on weather forecast models is a recent development, having only begun around 2000. along with advancements in computer science, mathematical models are being built and applied with machine learning techniques to create more accurate and reliable predictive models. this article will summarize the research and solutions for applying recurrent neural networks to forecast urban temperatures.

algorithm, forecasting model, neural network, (14 more...)

doi: 10.5281/zenodo.6190227

2405.17582

Country:

Asia > Vietnam (0.56)
North America > United States > Maryland (0.04)
Asia > Indonesia > Borneo > Kalimantan > East Kalimantan (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Artificial IntelligenceJan-11-2024

TIER: Text-Image Encoder-based Regression for AIGC Image Quality Assessment

Yuan, Jiquan, Cao, Xinyan, Che, Jinming, Wang, Qinyuan, Liang, Sen, Ren, Wei, Lin, Jinlong, Cao, Xixin

Recently, AIGC image quality assessment (AIGCIQA), which aims to assess the quality of AI-generated images (AIGIs) from a human perception perspective, has emerged as a new topic in computer vision. Unlike common image quality assessment tasks where images are derived from original ones distorted by noise, blur, and compression, \textit{etc.}, in AIGCIQA tasks, images are typically generated by generative models using text prompts. Considerable efforts have been made in the past years to advance AIGCIQA. However, most existing AIGCIQA methods regress predicted scores directly from individual generated images, overlooking the information contained in the text prompts of these images. This oversight partially limits the performance of these AIGCIQA methods. To address this issue, we propose a text-image encoder-based regression (TIER) framework. Specifically, we process the generated images and their corresponding text prompts as inputs, utilizing a text encoder and an image encoder to extract features from these text prompts and generated images, respectively. To demonstrate the effectiveness of our proposed TIER method, we conduct extensive experiments on several mainstream AIGCIQA databases, including AGIQA-1K, AGIQA-3K, and AIGCIQA2023. The experimental results indicate that our proposed TIER method generally demonstrates superior performance compared to baseline in most cases.

encoder, text encoder, text prompt, (12 more...)

2401.03854

Country:

Asia > China > Beijing > Beijing (0.04)
Asia > Middle East > Republic of Türkiye > Batman Province > Batman (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceDec-19-2023

Trust, but Verify: Robust Image Segmentation using Deep Learning

Zaman, Fahim Ahmed, Wu, Xiaodong, Xu, Weiyu, Sonka, Milan, Mudumbai, Raghuraman

We describe a method for verifying the output of a deep neural network for medical image segmentation that is robust to several classes of random as well as worst-case perturbations i.e. adversarial attacks. This method is based on a general approach recently developed by the authors called "Trust, but Verify" wherein an auxiliary verification network produces predictions about certain masked features in the input image using the segmentation as an input. A well-designed auxiliary network will produce high-quality predictions when the input segmentations are accurate, but will produce low-quality predictions when the segmentations are incorrect. Checking the predictions of such a network with the original image allows us to detect bad segmentations. However, to ensure the verification method is truly robust, we need a method for checking the quality of the predictions that does not itself rely on a black-box neural network. Indeed, we show that previous methods for segmentation evaluation that do use deep neural regression networks are vulnerable to false negatives i.e. can inaccurately label bad segmentations as good. We describe the design of a verification network that avoids such vulnerability and present results to demonstrate its robustness compared to previous methods.

incorrect segmentation, rec-net, segmentation, (16 more...)

2310.16999

Country:

North America > United States > Iowa > Johnson County > Iowa City (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > Monterey County > Pacific Grove (0.04)

Genre: Research Report (0.40)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.68)
Health & Medicine > Diagnostic Medicine > Imaging (0.50)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceOct-3-2023

Conditional Instrumental Variable Regression with Representation Learning for Causal Inference

Cheng, Debo, Xu, Ziqi, Li, Jiuyong, Liu, Lin, Liu, Jixue, Le, Thuc Duy

This paper studies the challenging problem of estimating causal effects from observational data, in the presence of unobserved confounders. The two-stage least square (TSLS) method and its variants with a standard instrumental variable (IV) are commonly used to eliminate confounding bias, including the bias caused by unobserved confounders, but they rely on the linearity assumption. Besides, the strict condition of unconfounded instruments posed on a standard IV is too strong to be practical. To address these challenging and practical problems of the standard IV method (linearity assumption and the strict condition), in this paper, we use a conditional IV (CIV) to relax the unconfounded instrument condition of standard IV and propose a non-linear CIV regression with Confounding Balancing Representation Learning, CBRL.CIV, for jointly eliminating the confounding bias from unobserved confounders and balancing the observed confounders, without the linearity assumption. We theoretically demonstrate the soundness of CBRL.CIV. Extensive experiments on synthetic and two real-world datasets show the competitive performance of CBRL.CIV against state-of-the-art IV-based estimators and superiority in dealing with the non-linear situation.

cbrl, estimation error, unobserved confounder, (13 more...)

2310.01865

Country:

North America > Greenland (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Oceania > Australia > South Australia (0.04)
North America > United States > Florida > Palm Beach County > Boca Raton (0.04)

Genre: Research Report > Experimental Study (0.69)

Industry: Health & Medicine > Epidemiology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Rahimi, Reyhaneh, Ebtehaj, Ardeshir, Behrangi, Ali, Tan, Jackson

Global Precipitation Nowcasting of Integrated Multi-satellitE Retrievals for GPM: A U-Net Convolutional LSTM Architecture

arXiv.org Artificial IntelligenceJul-20-2023

This paper presents a deep learning architecture for nowcasting of precipitation almost globally every 30 min with a 4-hour lead time. The architecture fuses a U-Net and a convolutional long short-term memory (LSTM) neural network and is trained using data from the Integrated MultisatellitE Retrievals for GPM (IMERG) and a few key precipitation drivers from the Global Forecast System (GFS). The impacts of different training loss functions, including the mean-squared error (regression) and the focal-loss (classification), on the quality of precipitation nowcasts are studied. The results indicate that the regression network performs well in capturing light precipitation (below 1.6 mm/hr), but the classification network can outperform the regression network for nowcasting of precipitation extremes (>8 mm/hr), in terms of the critical success index (CSI).. Using the Wasserstein distance, it is shown that the predicted precipitation by the classification network has a closer class probability distribution to the IMERG than the regression network. It is uncovered that the inclusion of the physical variables can improve precipitation nowcasting, especially at longer lead times in both networks. Taking IMERG as a relative reference, a multi-scale analysis in terms of fractions skill score (FSS), shows that the nowcasting machine remains skillful (FSS > 0.5) at the resolution of 10 km compared to 50 km for GFS. For precipitation rates greater than 4~mm/hr, only the classification network remains FSS-skillful on scales greater than 50 km within a 2-hour lead time.

artificial intelligence, machine learning, precipitation, (17 more...)

2307.10843

Country:

North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
North America > United States > Minnesota (0.04)
Southern Ocean (0.04)
(12 more...)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)